gausshyper (Gauss hypergeometric) distribution#
A flexible four-parameter distribution on the unit interval ([0,1]). In SciPy it lives at scipy.stats.gausshyper.
1) Title & Classification#
Name: gausshyper (Gauss hypergeometric distribution)
Type: continuous
Support: (x \in (0,1)) (endpoints have zero probability mass; the density may diverge at 0 or 1 depending on (a,b))
Parameter space (SciPy convention): (a>0,\ b>0,\ c\in\mathbb{R},\ z>-1)
SciPy signature:
scipy.stats.gausshyper(a, b, c, z, loc=0, scale=1)
Learning goals#
understand how gausshyper generalizes a Beta distribution via a tilt factor
recognize where the Gauss hypergeometric function ({}_2F_1) enters (normalization + moments)
derive mean/variance from the general raw-moment expression
implement a NumPy-only sampler (rejection sampling)
use SciPy for pdf, cdf, rvs, and fit
Table of contents#
Intuition & Motivation
Formal Definition
Moments & Properties
Parameter Interpretation
Derivations
Sampling & Simulation
Visualization
SciPy Integration
Statistical Use Cases
Pitfalls
Summary
import numpy as np
import scipy
from scipy import special
from scipy.stats import chi2
from scipy.stats import gausshyper
import plotly.graph_objects as go
import os
import plotly.io as pio
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
rng = np.random.default_rng(0)
print("python", __import__("sys").version.split()[0])
print("numpy ", np.__version__)
print("scipy ", scipy.__version__)
import plotly
print("plotly", plotly.__version__)
python 3.12.9
numpy 1.26.2
scipy 1.15.0
plotly 6.5.2
2) Intuition & Motivation#
A helpful way to view gausshyper is as a Beta(a,b) density multiplied by a tilt:
[ f(x) \propto \underbrace{x^{a-1}(1-x)^{b-1}}_{\text{Beta kernel}}\;\underbrace{(1+z x)^{-c}}_{\text{tilt}},\qquad x\in(0,1). ]
The Beta kernel controls how the density behaves near 0 and 1.
The factor ((1+z x)^{-c}) reweights the interior:
for (z>0), (c>0): ((1+z x)^{-c}) decreases in (x) → shifts mass toward 0
for (z>0), (c<0): ((1+z x)^{-c}) increases in (x) → shifts mass toward 1
for (-1<z<0), the direction flips because (1+z x) decreases with (x).
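A quick numeric look at the tilt factor at the endpoints confirms these directions (a small sketch; the (c,z) pairs are illustrative):
# Tilt factor (1+zx)^(-c) evaluated at the endpoints x=0 and x=1
for c_, z_ in [(3.0, 2.0), (-3.0, 2.0), (3.0, -0.75)]:
    t0, t1 = 1.0, (1.0 + z_) ** (-c_)
    direction = "0" if t1 < t0 else "1"
    print(f"c={c_:+}, z={z_:+}: tilt(0)={t0:.3f}, tilt(1)={t1:.3f} -> mass toward {direction}")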
What this distribution models#
A random probability / proportion in ([0,1]) when a plain Beta distribution is close but not flexible enough—especially if you want an extra “tilt” across the interval without changing the endpoint exponents (a-1) and (b-1).
Typical real-world use cases#
Bayesian priors for probabilities (conversion rates, reliabilities, mixture weights) with extra curvature beyond Beta.
Prior elicitation in Bayesian queueing models (Armero & Bayarri, 1994; cited by SciPy) where the tilt term arises naturally.
Relations to other distributions#
Beta(a,b) is a special case: if (c=0) (or (z=0)), then ((1+z x)^{-c}=1) and gausshyper reduces to Beta.
Related in spirit to other generalized Beta families that multiply the Beta kernel by an extra factor (e.g., exponential tilts like the Kummer beta).
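A one-line check of the special case (a small sketch; the parameter values are arbitrary):
# c=0 (or z=0) should reduce gausshyper exactly to Beta(a,b)
from scipy.stats import beta as beta_dist

x_chk = np.linspace(1e-4, 1 - 1e-4, 200)
err_c0 = np.max(np.abs(gausshyper.pdf(x_chk, 2.0, 5.0, 0.0, 2.0) - beta_dist.pdf(x_chk, 2.0, 5.0)))
err_z0 = np.max(np.abs(gausshyper.pdf(x_chk, 2.0, 5.0, 3.0, 0.0) - beta_dist.pdf(x_chk, 2.0, 5.0)))
print("max |pdf(c=0) - Beta pdf|:", float(err_c0))
print("max |pdf(z=0) - Beta pdf|:", float(err_z0))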
3) Formal Definition#
PDF#
For (x\in(0,1)), the probability density function is
[ f(x; a, b, c, z) = \frac{x^{a-1}(1-x)^{b-1}(1+z x)^{-c}}{B(a,b)\,{}_2F_1(c,a;a+b;-z)}, ]
where:
(B(a,b)=\frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}) is the Beta function,
({}_2F_1) is the Gauss hypergeometric function.
Parameter space (SciPy convention): (a>0), (b>0), (c\in\mathbb{R}), (z>-1).
The constraint (z>-1) guarantees (1+z x>0) on ([0,1]).
CDF#
A standard way to write the CDF is as the defining integral
[ F(x) = \int_0^x f(t)\,dt, \qquad x\in(0,1). ]
SciPy evaluates this numerically (gausshyper.cdf). For plotting, a fast approximation is to integrate the PDF on a fine grid (trapezoidal rule); a quadrature spot-check follows the PDF sanity check below.
def gausshyper_norm(a: float, b: float, c: float, z: float) -> float:
# Normalization integral: ∫_0^1 x^{a-1}(1-x)^{b-1}(1+zx)^{-c} dx
return special.beta(a, b) * special.hyp2f1(c, a, a + b, -z)
def gausshyper_pdf_formula(x: np.ndarray, a: float, b: float, c: float, z: float) -> np.ndarray:
x = np.asarray(x, dtype=float)
norm = gausshyper_norm(a, b, c, z)
return (x ** (a - 1.0) * (1.0 - x) ** (b - 1.0) / (1.0 + z * x) ** c) / norm
# Sanity check: formula vs SciPy
params = dict(a=2.0, b=5.0, c=3.0, z=2.0)
xs = np.linspace(1e-6, 1 - 1e-6, 400)
err = np.max(np.abs(gausshyper.pdf(xs, **params) - gausshyper_pdf_formula(xs, **params)))
err
8.881784197001252e-16
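And a spot-check of the CDF against adaptive quadrature of the PDF (a small sketch using scipy.integrate.quad):
# CDF spot-check: integrate the PDF from 0 to x0 and compare with gausshyper.cdf
from scipy import integrate

for x0 in (0.1, 0.3, 0.7):
    q, _ = integrate.quad(lambda t: gausshyper.pdf(t, **params), 0.0, x0)
    print(f"x0={x0}: quad={q:.10f}  scipy cdf={gausshyper.cdf(x0, **params):.10f}")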
4) Moments & Properties#
A clean way to work with moments is via raw moments (m_n = \mathbb{E}[X^n]).
Raw moments#
Using the same integral identity that normalizes the PDF, one can show (for (n>-a)):
[ m_n = \frac{B(a+n,b)}{B(a,b)}\,\frac{{}_2F_1(c,a+n;a+b+n;-z)}{{}_2F_1(c,a;a+b;-z)}. ]
In particular:
Mean: (\mu = m_1)
Variance: (\sigma^2 = m_2 - m_1^2)
Skewness and kurtosis#
From raw moments (m_1,\dots,m_4):
[ \mu_2 = m_2 - m_1^2, \quad \mu_3 = m_3 - 3 m_1 m_2 + 2 m_1^3, \quad \mu_4 = m_4 - 4 m_1 m_3 + 6 m_1^2 m_2 - 3 m_1^4. ]
[ \text{skewness} = \frac{\mu_3}{\mu_2^{3/2}}, \qquad \text{kurtosis} = \frac{\mu_4}{\mu_2^{2}}, \qquad \text{excess kurtosis} = \frac{\mu_4}{\mu_2^{2}} - 3. ]
SciPy computes ((\mu,\sigma^2,\text{skew},\text{kurtosis})) via gausshyper.stats(..., moments="mvsk"); note that the "k" SciPy returns is the excess kurtosis (Fisher convention), i.e. (\mu_4/\mu_2^2 - 3).
MGF / characteristic function#
Because (X\in[0,1]), the MGF exists for all real (t):
[ M_X(t)=\mathbb{E}[e^{tX}] = \int_0^1 e^{tx} f(x)\,dx. ]
A practical representation is the power series in terms of raw moments:
[ M_X(t) = \sum_{n=0}^{\infty} m_n\,\frac{t^n}{n!}. ]
The characteristic function is (\varphi_X(t)=M_X(it)).
Entropy#
The differential entropy is
[ H(X) = -\int_0^1 f(x)\log f(x)\,dx, ]
and is typically evaluated numerically (quadrature or Monte Carlo).
def raw_moment(n: int, a: float, b: float, c: float, z: float) -> float:
num = special.beta(a + n, b) * special.hyp2f1(c, a + n, a + b + n, -z)
den = special.beta(a, b) * special.hyp2f1(c, a, a + b, -z)
return float(num / den)
a, b, c, z = params["a"], params["b"], params["c"], params["z"]
m1 = raw_moment(1, a, b, c, z)
m2 = raw_moment(2, a, b, c, z)
m3 = raw_moment(3, a, b, c, z)
m4 = raw_moment(4, a, b, c, z)
var = m2 - m1**2
mu2 = var
mu3 = m3 - 3 * m1 * m2 + 2 * m1**3
mu4 = m4 - 4 * m1 * m3 + 6 * (m1**2) * m2 - 3 * m1**4
skew = mu3 / mu2**1.5
kurt = mu4 / mu2**2
excess_kurt = kurt - 3
print("mean (moment) ", m1)
print("var (moment) ", var)
print("skewness (moment) ", skew)
print("excess kurt (mom) ", excess_kurt)
mean_s, var_s, skew_s, kurt_s = gausshyper.stats(a, b, c, z, moments="mvsk")
print("\nSciPy stats(mvsk)")
print("mean", float(mean_s))
print("var ", float(var_s))
print("skew", float(skew_s))
print("kurt", float(kurt_s))
mean (moment) 0.2021617732371373
var (moment) 0.017617376175620106
skewness (moment) 1.0050439355317722
excess kurt (mom) 0.910103419130583
SciPy stats(mvsk)
mean 0.20216177323713735
var 0.017617376175620078
skew 1.0050439355317775
kurt 0.9101034191305897
# Numerical entropy estimate (trapezoidal integration)
xs = np.linspace(1e-6, 1 - 1e-6, 5000)
pdf = gausshyper.pdf(xs, a, b, c, z)
entropy_trapz = -np.trapz(pdf * np.log(pdf), xs)  # np.trapz is named np.trapezoid in NumPy >= 2.0
entropy_trapz
-0.7379404086142544
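A Monte Carlo cross-check of the same quantity, using (H(X) = -\mathbb{E}[\log f(X)]) (a small sketch; a fresh seeded generator is used so later cells are unaffected):
# Monte Carlo entropy estimate: H ≈ -mean(logpdf) over samples
rng_mc = np.random.default_rng(123)
mc = gausshyper.rvs(a, b, c, z, size=200_000, random_state=rng_mc)
entropy_mc = -float(np.mean(gausshyper.logpdf(mc, a, b, c, z)))
print("entropy (trapz):", entropy_trapz)
print("entropy (MC)   :", entropy_mc)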
# MGF via Gauss–Legendre quadrature (NumPy nodes/weights)
def mgf_legendre(t: float, a: float, b: float, c: float, z: float, n_nodes: int = 200) -> float:
nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
x = 0.5 * (nodes + 1.0) # [-1,1] → [0,1]
w = 0.5 * weights
return float(np.sum(w * np.exp(t * x) * gausshyper.pdf(x, a, b, c, z)))
for t in (-4.0, -1.0, 0.0, 1.0, 4.0):
print(f"t={t:>4}: M(t)≈{mgf_legendre(t, a, b, c, z):.8f}")
t=-4.0: M(t)≈0.50168881
t=-1.0: M(t)≈0.82387791
t= 0.0: M(t)≈1.00000000
t= 1.0: M(t)≈1.23537399
t= 4.0: M(t)≈2.65727886
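The power-series representation above gives an independent check (a small sketch; 30 terms is an arbitrary truncation that converges quickly on ([-1,1])):
# Truncated moment series M(t) ≈ Σ_{n<N} m_n t^n / n!, compared with quadrature
from math import factorial

def mgf_series(t: float, a: float, b: float, c: float, z: float, n_terms: int = 30) -> float:
    return sum(raw_moment(n, a, b, c, z) * t**n / factorial(n) for n in range(n_terms))

for t in (-1.0, 1.0):
    print(f"t={t:>4}: series≈{mgf_series(t, a, b, c, z):.8f}  quadrature≈{mgf_legendre(t, a, b, c, z):.8f}")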
5) Parameter Interpretation#
You can interpret gausshyper as
[ f(x) \propto \underbrace{x^{a-1}(1-x)^{b-1}}_{\text{Beta kernel}}\,\underbrace{(1+z x)^{-c}}_{\text{tilt}}. ]
(a): controls behavior near 0.
smaller (a) puts more mass near 0; (a<1) yields an (integrable) singularity at 0.
(b): controls behavior near 1.
smaller (b) puts more mass near 1; (b<1) yields an (integrable) singularity at 1.
(c): strength/direction of the tilt.
for fixed (z>0): (c>0) pushes mass toward 0, (c<0) pushes mass toward 1.
(z): how quickly the tilt changes with (x).
as (z\to 0), the tilt disappears and the distribution approaches Beta.
# Shape changes: PDFs for several parameter settings
param_sets = [
dict(a=2, b=5, c=0, z=2, label="Beta special case (c=0)"),
dict(a=2, b=5, c=+3, z=2, label="c=+3, z=2 (tilt toward 0)"),
dict(a=2, b=5, c=-3, z=2, label="c=-3, z=2 (tilt toward 1)"),
dict(a=2, b=5, c=+3, z=-0.75, label="c=+3, z=-0.75"),
]
xs = np.linspace(1e-4, 1 - 1e-4, 500)
fig = go.Figure()
for p in param_sets:
fig.add_trace(
go.Scatter(
x=xs,
y=gausshyper.pdf(xs, p["a"], p["b"], p["c"], p["z"]),
mode="lines",
name=p["label"],
)
)
fig.update_layout(
title="PDF shapes for gausshyper",
xaxis_title="x",
yaxis_title="density",
)
fig.show()
6) Derivations#
Expectation / variance (via raw moments)#
Start from the PDF:
[ f(x)=\frac{1}{B(a,b)\,{}_2F_1(c,a;a+b;-z)}\;x^{a-1}(1-x)^{b-1}(1+z x)^{-c}. ]
For (n\ge 0), the raw moment is
[ \mathbb{E}[X^n] =\frac{1}{B(a,b)\,{}_2F_1(c,a;a+b;-z)}\int_0^1 x^{a+n-1}(1-x)^{b-1}(1+z x)^{-c}\,dx. ]
The integral is the same normalization integral with (a\to a+n), so
[ \mathbb{E}[X^n] =\frac{B(a+n,b)}{B(a,b)}\,\frac{{}_2F_1(c,a+n;a+b+n;-z)}{{}_2F_1(c,a;a+b;-z)}. ]
Then
[ \mathbb{E}[X]=m_1, \qquad \mathrm{Var}(X)=m_2-m_1^2. ]
Likelihood#
Given i.i.d. data (x_1,\dots,x_N\in(0,1)) and loc=0, scale=1, the log-likelihood is
[ \ell(a,b,c,z) = -N\log B(a,b) - N\log {}_2F_1(c,a;a+b;-z) + (a-1)\sum_i \log x_i + (b-1)\sum_i \log(1-x_i) - c\sum_i \log(1+z x_i). ]
SciPy’s fit performs numerical maximum likelihood estimation.
def gausshyper_logpdf_formula(x: np.ndarray, a: float, b: float, c: float, z: float) -> np.ndarray:
x = np.asarray(x, dtype=float)
norm = gausshyper_norm(a, b, c, z)
return (a - 1.0) * np.log(x) + (b - 1.0) * np.log(1.0 - x) - c * np.log1p(z * x) - np.log(norm)
xs_test = np.linspace(1e-6, 1 - 1e-6, 20)
max_logpdf_err = np.max(
np.abs(gausshyper.logpdf(xs_test, a, b, c, z) - gausshyper_logpdf_formula(xs_test, a, b, c, z))
)
max_logpdf_err
1.7763568394002505e-15
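To make "numerical MLE" concrete, here is a minimal by-hand sketch (not SciPy's internal routine) that minimizes the negative log-likelihood over ((a,b,c)) with (z) fixed; the Nelder–Mead choice, starting point, and fresh seeded generator are illustrative assumptions:
# Hand-rolled MLE: minimize the negative log-likelihood directly
from scipy.optimize import minimize

rng_mle = np.random.default_rng(7)
data_mle = gausshyper.rvs(a, b, c, z, size=2_000, random_state=rng_mle)

def negloglik(theta: np.ndarray) -> float:
    a_t, b_t, c_t = theta
    if a_t <= 0 or b_t <= 0:
        return np.inf  # outside the parameter space
    return -float(np.sum(gausshyper_logpdf_formula(data_mle, a_t, b_t, c_t, z)))

res = minimize(negloglik, x0=np.array([1.0, 1.0, 0.0]), method="Nelder-Mead")
print("by-hand MLE (a, b, c):", res.x)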
7) Sampling & Simulation#
NumPy-only rejection sampler#
Because
[ f(x) \propto \text{Beta}(a,b)\times (1+z x)^{-c}, ]
we can use rejection sampling with a Beta proposal:
Propose (X\sim\text{Beta}(a,b)).
Accept with probability (\frac{(1+zX)^{-c}}{M}), where
[ M=\max_{x\in[0,1]}(1+z x)^{-c} = \max\big(1,\ (1+z)^{-c}\big). ]
Key point: this does not require computing ({}_2F_1) or the Beta function.
def rvs_gausshyper_numpy(
size: int,
a: float,
b: float,
c: float,
z: float,
rng: np.random.Generator | None = None,
batch: int = 20_000,
return_acceptance: bool = False,
):
# Sample from gausshyper(a,b,c,z) on (0,1) using NumPy-only rejection sampling.
# Proposal: X ~ Beta(a,b)
# Accept with prob: (1+zX)^(-c) / M, where M = max(1, (1+z)^(-c))
if rng is None:
rng = np.random.default_rng()
size = int(size)
if size < 0:
raise ValueError("size must be non-negative")
if not (a > 0 and b > 0 and z > -1 and np.isfinite(c)):
raise ValueError("invalid parameters: require a>0, b>0, z>-1, c finite")
if size == 0:
out = np.array([], dtype=float)
return (out, 1.0) if return_acceptance else out
logM = max(0.0, -c * np.log1p(z))
out = np.empty(size, dtype=float)
filled = 0
proposals = 0
while filled < size:
x = rng.beta(a, b, size=batch)
u = rng.random(batch)
logw = -c * np.log1p(z * x)
accept = np.log(u) < (logw - logM)
accepted = x[accept]
n_take = min(size - filled, accepted.size)
out[filled : filled + n_take] = accepted[:n_take]
filled += n_take
proposals += batch
if proposals > 100_000_000:
raise RuntimeError("Too many proposals; acceptance rate is extremely low.")
accept_rate = size / proposals
return (out, accept_rate) if return_acceptance else out
samples_np, acc = rvs_gausshyper_numpy(20_000, a, b, c, z, rng=rng, return_acceptance=True)
acc
0.25
# Compare NumPy-only sampling to SciPy moments
mean_s, var_s, skew_s, kurt_s = gausshyper.stats(a, b, c, z, moments="mvsk")
print("NumPy sampler acceptance rate:", acc)
print("sample mean:", float(np.mean(samples_np)))
print("sample var :", float(np.var(samples_np)))
print("theory mean:", float(mean_s))
print("theory var :", float(var_s))
NumPy sampler acceptance rate: 0.25
sample mean: 0.2027499903270397
sample var : 0.01751052124828153
theory mean: 0.20216177323713735
theory var : 0.017617376175620078
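The empirical acceptance rate can also be checked against theory: for the Beta proposal, the acceptance probability is (\mathbb{E}_{\text{Beta}(a,b)}\big[(1+zX)^{-c}\big]/M = {}_2F_1(c,a;a+b;-z)/M), by the same normalization identity used above.
# Theoretical acceptance rate of the Beta-proposal rejection sampler
M = max(1.0, (1.0 + z) ** (-c))
acc_theory = special.hyp2f1(c, a, a + b, -z) / M
print("acceptance (empirical):", acc)
print("acceptance (theory)   :", float(acc_theory))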
8) Visualization#
We’ll visualize:
PDF
CDF (numerical integration on a grid)
Monte Carlo samples (histogram + empirical CDF)
# PDF + Monte Carlo histogram
xs = np.linspace(1e-4, 1 - 1e-4, 600)
pdf = gausshyper.pdf(xs, a, b, c, z)
fig = go.Figure()
fig.add_trace(
go.Histogram(
x=samples_np,
nbinsx=60,
histnorm="probability density",
name="NumPy samples (hist)",
opacity=0.6,
)
)
fig.add_trace(go.Scatter(x=xs, y=pdf, mode="lines", name="SciPy PDF", line=dict(width=3)))
fig.update_layout(title="gausshyper: PDF and Monte Carlo samples", xaxis_title="x", yaxis_title="density")
fig.show()
# CDF: integrate PDF on a grid (trapezoidal rule)
dx = xs[1] - xs[0]
# cumulative trapezoid without scipy.integrate
cdf = np.concatenate([[0.0], np.cumsum(0.5 * (pdf[:-1] + pdf[1:]) * dx)])
cdf = cdf / cdf[-1] # enforce ending at 1 (small numerical drift)
# Empirical CDF from samples
x_sorted = np.sort(samples_np)
ecdf = np.arange(1, x_sorted.size + 1) / x_sorted.size
fig = go.Figure()
fig.add_trace(go.Scatter(x=xs, y=cdf, mode="lines", name="CDF (integrated PDF)", line=dict(width=3)))
fig.add_trace(go.Scatter(x=x_sorted, y=ecdf, mode="lines", name="Empirical CDF (samples)", line=dict(dash="dot")))
# A few SciPy CDF evaluations as a spot-check
x_pts = np.array([0.05, 0.2, 0.5, 0.8, 0.95])
fig.add_trace(
go.Scatter(
x=x_pts,
y=gausshyper.cdf(x_pts, a, b, c, z),
mode="markers",
name="SciPy CDF (points)",
marker=dict(size=9),
)
)
fig.update_layout(title="gausshyper: CDF", xaxis_title="x", yaxis_title="F(x)")
fig.show()
9) SciPy Integration#
scipy.stats.gausshyper provides:
pdf, logpdf
cdf, ppf (computed numerically)
rvs
moment, stats
fit (MLE)
Below is a minimal “cookbook” usage.
# Basic usage
x0 = 0.3
print("pdf:", gausshyper.pdf(x0, a, b, c, z))
print("cdf:", gausshyper.cdf(x0, a, b, c, z))
# Random variates
samples_scipy = gausshyper.rvs(a, b, c, z, size=10_000, random_state=rng)
# Moments / stats
print("stats (mvsk):", gausshyper.stats(a, b, c, z, moments="mvsk"))
print("moment(3):", gausshyper.moment(3, a, b, c, z))
pdf: 1.627632642063357
cdf: 0.7906632180054841
stats (mvsk): (0.20216177323713735, 0.017617376175620078, 1.0050439355317775, 0.9101034191305897)
moment(3): 0.021297063942282413
# MLE fitting (fix loc/scale to stay on [0,1])
data_fit = gausshyper.rvs(a, b, c, z, size=2_000, random_state=rng)
(a_hat, b_hat, c_hat, z_hat, loc_hat, scale_hat) = gausshyper.fit(data_fit, floc=0, fscale=1)
print("fitted shapes:", (a_hat, b_hat, c_hat, z_hat))
print("loc/scale:", (loc_hat, scale_hat))
fitted shapes: (1.9764016414539523, 4.079528064526103, 15.333843481262363, 0.37090878984369335)
loc/scale: (0, 1)
10) Statistical Use Cases#
10.1 Hypothesis testing#
A simple nested test fixes (z) and compares:
Null: (c=0) (reduces to a Beta distribution)
Alternative: (c) free
With (z) fixed, this is a 1-degree-of-freedom likelihood ratio test; by Wilks' theorem, the statistic is asymptotically (\chi^2_1) under the null.
10.2 Bayesian modeling#
Let (p\in(0,1)) be a success probability.
Prior: [ p \sim \text{gausshyper}(a,b,c,z) ]
Binomial likelihood: [ y\mid p \sim \text{Binomial}(n,p) ]
Posterior is in the same family: [ p\mid y \sim \text{gausshyper}(a+y,\ b+n-y,\ c,\ z). ]
This mirrors Beta–Binomial conjugacy; the tilt term ((1+z p)^{-c}) stays unchanged.
10.3 Generative modeling#
A simple hierarchical model: [ p_i \sim \text{gausshyper}(a,b,c,z),\qquad y_i\mid p_i \sim \text{Binomial}(m,p_i). ]
The induced marginal distribution of counts (y_i) is typically more flexible than a Beta–Binomial because the prior over (p_i) is more flexible.
# 10.1 Likelihood ratio test: H0 (c=0) vs H1 (c free), with z fixed
z_fixed = z
x = gausshyper.rvs(a, b, c, z_fixed, size=1_500, random_state=rng)
# Null: c=0, z fixed
(a0, b0, c0, z0, loc0, scale0) = gausshyper.fit(x, fc=0, fz=z_fixed, floc=0, fscale=1)
ll0 = float(np.sum(gausshyper.logpdf(x, a0, b0, c0, z0)))
# Alternative: c free, z fixed
(a1, b1, c1, z1, loc1, scale1) = gausshyper.fit(x, fz=z_fixed, floc=0, fscale=1)
ll1 = float(np.sum(gausshyper.logpdf(x, a1, b1, c1, z1)))
lr = 2 * (ll1 - ll0)
p_value = 1 - chi2.cdf(lr, df=1)
print("H0 fit (a,b,c,z):", (a0, b0, c0, z0))
print("H1 fit (a,b,c,z):", (a1, b1, c1, z1))
print("LL0:", ll0)
print("LL1:", ll1)
print("LR statistic:", lr)
print("approx p-value (chi^2_1):", p_value)
H0 fit (a,b,c,z): (1.721573236948028, 6.708069116289048, 0, 2.0)
H1 fit (a,b,c,z): (2.038367704355453, 4.675183494500131, 3.453529357407323, 2.0)
LL0: 1092.9216112905583
LL1: 1097.7549215034153
LR statistic: 9.666620425713973
approx p-value (chi^2_1): 0.0018764617850800525
# 10.2 Bayesian update for Binomial: prior/posterior densities
# Prior parameters
prior = dict(a=2.0, b=5.0, c=3.0, z=2.0)
# Binomial data
n = 40
y = 12
post = dict(a=prior["a"] + y, b=prior["b"] + (n - y), c=prior["c"], z=prior["z"])
xs = np.linspace(1e-4, 1 - 1e-4, 600)
fig = go.Figure()
fig.add_trace(go.Scatter(x=xs, y=gausshyper.pdf(xs, **prior), mode="lines", name="prior"))
fig.add_trace(go.Scatter(x=xs, y=gausshyper.pdf(xs, **post), mode="lines", name="posterior"))
fig.update_layout(
title=f"Conjugate update for Binomial: n={n}, y={y}",
xaxis_title="p",
yaxis_title="density",
)
fig.show()
print("prior mean :", float(gausshyper.mean(**prior)))
print("posterior mean :", float(gausshyper.mean(**post)))
prior mean : 0.20216177323713735
posterior mean : 0.28176394627450635
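As a numerical check of the conjugacy claim (a small sketch), the unnormalized product prior × binomial likelihood should match the closed-form posterior density after normalization:
# Verify conjugacy on a grid: prior(p) * p^y (1-p)^(n-y), normalized, vs posterior pdf
grid = np.linspace(1e-4, 1 - 1e-4, 2000)
unnorm = gausshyper.pdf(grid, **prior) * grid**y * (1.0 - grid) ** (n - y)
unnorm /= np.trapz(unnorm, grid)  # normalize numerically
print("max |grid posterior - closed form|:", float(np.max(np.abs(unnorm - gausshyper.pdf(grid, **post)))))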
# 10.3 Hierarchical simulation: p ~ gausshyper, y|p ~ Binomial(m,p)
m = 20
N = 10_000
p_gh = gausshyper.rvs(prior["a"], prior["b"], prior["c"], prior["z"], size=N, random_state=rng)
y_gh = rng.binomial(m, p_gh)
# Baseline: Beta prior with same a,b (c=0)
p_beta = rng.beta(prior["a"], prior["b"], size=N)
y_beta = rng.binomial(m, p_beta)
fig = go.Figure()
fig.add_trace(go.Histogram(x=y_beta, histnorm="probability", name="Beta prior", opacity=0.55))
fig.add_trace(go.Histogram(x=y_gh, histnorm="probability", name="gausshyper prior", opacity=0.55))
fig.update_layout(
barmode="overlay",
title=f"Counts y with m={m}: Beta vs gausshyper prior over p",
xaxis_title="y",
yaxis_title="probability",
)
fig.show()
print("Var(y) with Beta prior :", float(np.var(y_beta)))
print("Var(y) with gausshyper prior:", float(np.var(y_gh)))
Var(y) with Beta prior : 13.73694559
Var(y) with gausshyper prior: 9.92441084
11) Pitfalls#
Invalid parameters:
must have (a>0) and (b>0) for integrability, and (z>-1) so that (1+z x) stays positive on ([0,1]).
Boundary behavior:
if (a<1) or (b<1), the PDF diverges at 0 or 1 (still integrable). Avoid evaluating exactly at 0 or 1 in floating point.
Numerical stability:
the normalization constant uses ({}_2F_1), which can overflow/underflow for extreme parameters; prefer logpdf when fitting.
cdf can be relatively slow because it is computed numerically.
Rejection sampling efficiency:
acceptance can be very low if (M=\max(1,(1+z)^{-c})) is huge (e.g., (z) close to (-1) and (c>0)). In that regime you may need a better proposal than Beta.
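A quick sketch of how fast acceptance collapses (illustrative values with (a=2,\ b=5)):
# Theoretical acceptance rate 2F1(c, a; a+b; -z) / M as z approaches -1 with c > 0
a_p, b_p, c_p = 2.0, 5.0, 5.0
for z_bad in (-0.5, -0.9, -0.99, -0.999):
    M_bad = max(1.0, (1.0 + z_bad) ** (-c_p))
    acc_bad = special.hyp2f1(c_p, a_p, a_p + b_p, -z_bad) / M_bad
    print(f"z={z_bad:>7}: M={M_bad:.3e}, acceptance ~ {acc_bad:.3e}")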
12) Summary#
gausshyper is a continuous distribution on ((0,1)) with density (\propto x^{a-1}(1-x)^{b-1}(1+z x)^{-c}).
It generalizes the Beta distribution; setting (c=0) (or (z=0)) recovers Beta.
Normalization and moments involve the Gauss hypergeometric function ({}_2F_1).
Raw moments have a closed-form ratio of Beta functions and ({}_2F_1) terms; mean and variance follow directly.
A simple NumPy-only sampler uses rejection sampling with a Beta proposal.
SciPy provides pdf, cdf, rvs, and fit, but be mindful of numerical and performance pitfalls for extreme parameters.
References
SciPy gausshyper docstring/source.
Armero, C., & Bayarri, M. J. (1994). "Prior Assessments for Prediction in Queues." Journal of the Royal Statistical Society: Series D (The Statistician).